metaTED: a Corpus of Metadiscourse for Spoken Language
Abstract
This paper describes metaTED, a freely available corpus of metadiscursive acts in spoken language collected via crowdsourcing. Metadiscursive acts were annotated on a set of 180 randomly chosen TED talks in English, spanning different speakers and topics. The taxonomy used for annotation is composed of 16 categories, adapted from Ädel (2010). This adaptation takes into account both the material to be annotated and the setting in which the annotation task is performed. The crowdsourcing setup is described, including considerations regarding training and quality control. The collected data is evaluated in terms of the number of occurrences, inter-annotator agreement, and annotation-related measures (such as average time on task and self-reported confidence). Results show different levels of agreement among metadiscourse acts (α ∈ [0.15; 0.49]). To further assess the collected material, a subset of the annotations was submitted to experts, who validated which of the marked occurrences truly correspond to instances of the metadiscursive act at hand. As with the crowd, the experts showed different levels of agreement across categories (α ∈ [0.18; 0.72]). The paper concludes with a discussion of the applicability of metaTED with respect to each of the 16 categories of metadiscourse.
Similar resources
A Comparative Study of Metadiscourse Markers in English and Persian University Lectures
The purpose of this study was to compare metadiscourse markers in forty English and Persian university lectures. Twenty of them were selected from the British Academic Spoken English corpus. The other 20 were selected from an Iranian website (www.maktoobkhane.com). We used Hyland’s (2005) model of metadiscourse. The metadiscourses were collected. Further, the frequency of each type was studied....
Metadiscourse Markers in a Corpus of Learner Language: The Case of Iranian EFL Learners
Different issues have been probed in learner corpus research since the late 1980s. However, given the importance of metadiscourse markers (MDMs) in signposting academic discourse, their use in Iranian EFL learners' academic essays is an area of research in need of a more serious analysis. Contributing to this line of investigation, this paper reports a corpus-based study of the use of MDMs i...
Lexical Level Distribution of Metadiscourse in Spoken Language
This paper targets an understanding of how metadiscourse functions in spoken language. Starting from a metadiscourse taxonomy, a set of TED talks is annotated via crowdsourcing and then a lexical grade level predictor is used to map the distribution of the distinct discourse functions of the taxonomy across levels. The paper concludes showing how speakers use these functions in presentational s...
Metadiscourse in Applied Linguistics and Chemistry Research Article Introductions
This study examined disciplinary rhetoric in research articles, focusing on different traditions in structuring text discourses from a metadiscourse-move analytic approach. The corpus consisted of 72 research article Introductions (RAIs): 36 in applied linguistics and 36 in chemistry. Swales’ CARS model (1990, 2004) and Hyland’s interpersonal model of metadiscourse (2005) were used as analytica...
Vague Language and Interpersonal Communication: An Analysis of Adolescent Intercultural Conversation
This paper is concerned with the analysis of the spoken language of teenagers, taken from a newly developed specialised corpus, the British and Taiwanese Teenage Intercultural Communication Corpus (BATTICC). More specifically, the study employs a discourse analytical approach to examine vague language in an intercultural context among a group of British and Taiwanese adolescents, paying particul...